Abstract. The main goal of our work is to formally prove the correctness of
the key commands of the SCHUR software, an interactive program for
calculating with characters of Lie groups and symmetric functions. The core
of the computations relies on the enumeration and manipulation of
combinatorial structures. As a first "proof of concept", we present a formal
proof of the conjugate function, written in C. This function computes the
conjugate of an integer partition. To formally prove this program, we use
the Frama-C software. It allows us to annotate C functions and to generate
proof obligations, which are proved using several automated theorem provers.
In this paper, we also discuss the methodology for formally proving this
kind of program.

1 Introduction 

SCHUR [1] is interactive software for calculating properties of Lie groups
and symmetric functions [2]. It is used in research by combinatorists,
physicists, and theoretical chemists [3], as well as for educational
purposes, as a learning tool for students in algebraic combinatorics. One of
its main uses is to state conjectures on combinatorial objects. For such
use, it is important to have some confidence in the results produced by
SCHUR.

Until now, the method used to get some confidence in the results has mostly 
been based on just one example for each command. 

The computation of other examples is complex due to the well known combi- 
natorial explosion, especially when using algorithms associated to the symmetric 
group, see section 2.1. Unfortunately, the combinatorial explosion as well as com- 
puting time forbid test generation or verification techniques (model checking). 
Therefore, in this paper, we focus on formal proof of the existing program. 

With the aim of verifying the whole software, we start by proving the
correctness of its fundamental building blocks. The main combinatorial
object used in SCHUR is the integer partition. The first non-trivial
operation on integer partitions is the conjugate. Moreover, the conjugate
function is necessary for more than half of the 240 interactive commands of
SCHUR. From this point of view, conjugate is a critical function of SCHUR.

The very first task consists in isolating the code of this function from the
program (see section 4.3). Next, we chose Frama-C [4], the successor of
Caduceus and the most popular tool in the program proof community. Frama-C
is a plug-in system. In order to prove programs, we used Jessie [5], the
deductive verification plug-in for C programs annotated with ACSL [6]. The
generated verification conditions can be submitted to external automatic
provers and, for more complex situations, to interactive theorem provers as
well (see section 2.2).

After a short presentation of the software tools and theoretical concepts,
we present the formal proof of a program. Then, after discussing the
difficulties and mistakes encountered along the way, we propose a
methodology to prove such software and discuss future work.

2 Presentation of the Software Used 

2.1 The SCHUR Software 

SCHUR is interactive software for calculating properties of Lie groups and
symmetric functions. A symmetric function is a function which is symmetric,
or invariant, under any permutation of its variables. For example,
$f(x_1, x_2, x_3) = x_1 + x_2 + x_3$ is a symmetric function.

SCHUR was originally written in Pascal by Prof. Brian G. Wybourne. It was
then translated into C by an automatic tool, which makes it quite difficult
to read. There are almost no comments in the code, which is more than
50,000 lines long and uses many global variables. Local variables have names
such as W52.

After the death of Prof. Wybourne in November 2003, some people felt that 
his program should be maintained, and if possible enhanced, with a view to 
making it freely available to the mathematics and physics research community. 

Nowadays, it is open source under the GPL license and includes more than 
240 commands. The code still includes very few comments. Some mistakes have 
been corrected but some interactive commands are so intricate that it is difficult 
to have more than a few examples to check them against and most people do 
not even know if the result is correct or not. 

This is why we started to work on this code. First, some of the commands in
SCHUR are very well implemented (for example, plethysm is computed faster by
SCHUR than by many other combinatorial toolboxes). Second, formally proving
some key functions inside it would be a major advance for its research
community.

2.2 The Frama-C Software 

Frama-C [4] is an open source extensible platform dedicated to source code analy- 
sis of C software. It is co-developed by two French public institutions: CEA-LIST 
(Software Reliability Laboratory) and INRIA-Saclay (ProVal project). 

Frama-C is a plug-in system. In order to prove programs, we use Jessie [5],
the deductive verification plug-in for C programs annotated with ACSL [6].
It internally uses the languages and tools of the Why platform [7]. The
Jessie plug-in uses Hoare-style [8] weakest precondition computations to
formally prove ACSL properties. The generated verification conditions (VC)
can be submitted to external automatic provers such as Simplify [9],
Alt-Ergo [10], Z3 [11], and CVC3 [12].

These automatic provers are SMT (Satisfiability Modulo Theories) solvers.
The SMT problem is a decision problem for logical formulas with respect to
combinations of background theories expressed in classical first-order logic
with equality. First-order logic is undecidable, so it is not possible to
build a procedure that can solve arbitrary SMT problems. Therefore, most
procedures focus on the more realistic goal of efficiently solving problems
that occur in practice.

For more complex situations, interactive theorem provers can be used to 
establish the validity of VCs, like Coq [13], PVS [14], Isabelle/HOL [15], etc. For 
our purpose, we used Coq (see section 4.2) since it is the one best known to the 
authors. 

3 The Conjugate Function 

In this section, the basics of algebraic combinatorics are given so that the reader 
can understand what is actually proved. Interestingly in this field, though the 
interpretation of what is actually computed can be of a very abstract algebraic 
level, the computation itself boils down most of the time to possibly intricate 
but rather elementary manipulations. 

3.1 Combinatorial and Algebraic Background: Integer Partitions 

A partition of a positive integer n is a way of writing n as a sum of a
non-increasing sequence of positive integers. For example
$\lambda = (4, 2, 2, 1)$ and $\mu = (2, 1)$ are partitions of $n = 9$ and
$n' = 3$ respectively. We write $|\lambda| = n$ and $|\mu| = n'$ [16].

The Ferrers diagram $F^\lambda$ associated to a partition
$\lambda = (\lambda_1, \lambda_2, \dots, \lambda_p)$ consists of
$|\lambda| = n$ boxes, arranged in $l(\lambda) = p$ left-justified rows of
lengths $\lambda_1, \lambda_2, \dots, \lambda_p$. Rows in $F^\lambda$ are
oriented downwards (or upwards for some authors). $F^\lambda$ is called the
shape of $\lambda$.

Definition 1. The conjugate of an integer partition is the partition
associated to the reflection of its shape along the main diagonal.

For example, for $\lambda = (3, 2, 1, 1, 1)$, the Ferrers diagram
$F^\lambda$ (left) and the Ferrers diagram of the conjugate partition
(right) are:

    * * *          * * * * *
    * *            * *
    *              *
    *
    *

So the conjugate partition of (3, 2, 1, 1, 1) is (5, 2, 1).

A semi-standard Young tableau of shape $\lambda$ is a numbering of the boxes
of $F^\lambda$ with entries from $\{1, 2, \dots, n\}$, weakly increasing
across rows and strictly increasing down columns. A tableau is standard if
and only if each entry appears exactly once.

[Figure: an example of a tableau of shape (4, 2, 2, 1).]

A symmetric function of a set of variables $\{x_1, x_2, \dots\}$ is a
function $f(x_1, x_2, \dots)$ of those variables which is invariant under
any permutation of those variables (for example,
$f(x_1, x_2, \dots) = f(x_2, x_1, \dots)$). This definition is usually
restricted to polynomial functions. The most important linear basis of the
algebra of symmetric functions consists of the Schur functions, which are
combinatorially defined as follows: for a given semi-standard Young tableau
$T$ of shape $\lambda$, write $x^T$ for the product of the $x_i$ for all $i$
appearing in the tableau. Then

$$s_\lambda(x) = \sum_{T \in \mathrm{Tab}(\lambda)} x^T \qquad (1)$$

where $\mathrm{Tab}(\lambda)$ is the set of all tableaux of shape $\lambda$.
We write $s_\lambda$ for $s_\lambda(x)$. For example, consider the tableaux
of shape (2, 1) using just three variables $x_1, x_2, x_3$:
[Figure: the semi-standard tableaux of shape (2, 1) with entries in
{1, 2, 3}.]

The associated Schur function is therefore:

$$s_{(21)}(x_1, x_2, x_3) = x_1^2 x_2 + x_1^2 x_3 + x_2^2 x_3 + 2 x_1 x_2 x_3 + x_1 x_2^2 + x_1 x_3^2 + x_2 x_3^2 \qquad (2)$$

thus:

$$s_{(21)} = s_{(21)}(x_1, x_2, x_3) + s_{(21)}(x_1, x_2, x_3, x_4) + \cdots$$

Note that, with this combinatorial definition, the symmetry of
$s_{(21)}(x_1, x_2, x_3)$ is not exactly obvious.

We need to recall some well-known results in symmetric function theory.
Though Schur functions were historically defined by Jacobi [17], they were
named in honor of Schur, who discovered their crucial role in the
representation theory of the symmetric group and the general linear group.
Namely, after the discovery by Frobenius that the irreducible
representations of the symmetric groups are indexed by integer partitions,
Schur showed that those functions can be interpreted as characters of those
irreducible representations and, by Schur-Weyl duality, as characters of Lie
groups and Lie algebras. Notably we obtain the representations of the
general linear groups ($GL_n$) and unitary groups ($U_n$) [18] from the
symmetric group representations. In this setting, the conjugate of the
partition essentially encodes the tensor product of a representation by the
sign representation.

Further work by Schur and Littlewood involves infinite sums of Schur
functions associated to partitions [19] whose conjugates have a particular
form. In particular, these series are used to obtain the characters of
symplectic ($Sp_{2n}$) and orthogonal ($O_n$) groups (symplectic and
orthogonal Schur functions) from standard Schur functions [20].

One particularly important and difficult computational problem here is
plethysm (see the SCHUR reference manual [1] and [2]). It is the analogue
for symmetric functions of the substitution of one polynomial inside
another, $f(x) \mapsto f(g(x))$. It is called plethysm because, through
combinatorial explosion, it very quickly involves a lot (a plethora) of
terms, which makes it very difficult to compute efficiently. For example,
$s_{(21)}(s_{(21)})$, the first example with non-trivial partitions in the
input, is already very hard to compute by hand. First we can regard
$s_{(21)}$ as a function of as many variables as there are monomials in (2):

$$s_{(21)}(s_{(21)})(x_1, x_2, x_3) = s_{(21)}(x_1^2 x_2,\ x_1^2 x_3,\ x_2^2 x_3,\ x_1 x_2 x_3,\ x_1 x_2 x_3,\ x_1 x_2^2,\ x_1 x_3^2,\ x_2 x_3^2)$$

It can be shown that the following holds:

$$s_{(21)}(s_{(21)}) = s_{(22221)} + s_{(321111)} + 2 s_{(32211)} + s_{(3222)} + s_{(33111)} + 3 s_{(3321)} + 2 s_{(42111)} + 3 s_{(4221)} + 3 s_{(4311)} + 3 s_{(432)} + s_{(441)} + s_{(51111)} + 2 s_{(5211)} + s_{(522)} + 2 s_{(531)} + s_{(54)} + s_{(621)}$$

3.2 Computation in Algebraic Combinatorics 

Basically, the architecture of software for computing in algebraic
combinatorics is composed of two parts:

- a computer algebra kernel dealing with the bookkeeping of expressions and
linear combinations (parsing, printing, collecting, Gaussian and Gröbner
elimination algorithms, ...);

- a very large collection of small combinatorial functions which enumerate
and manipulate the combinatorial data structures.

In algebraic combinatorics software, for each basic combinatorial structure such 
as permutations or partitions, there are typically 50-200 different functions. Con- 
jugating a partition is a very good example of what those many functions do, 
that is surgery on lists of integers or lists of lists of integers or more advanced 
recursive structures like trees. . . In a basic computation, most of the time is 
spent mapping or iterating those functions on some sets of objects. But due to 
combinatorial explosion those sets can be very large so these functions must be 
very well optimized. 

3.3 Properties 

The definition of the conjugate (reflection of the partition shape along its
diagonal) is easy to understand but may lead to naive implementations that
are inefficient.

Let us suppose that we represent an integer partition by an integer array
indexed from 1. For example $\lambda = (3, 2, 1, 1, 1)$ gives $t[1] = 3$,
$t[2] = 2$, ..., $t[l(\lambda)] = 1$. Recall that $t[i]$ is non-increasing,
that is $t[i + 1] \le t[i]$.

One way to compute the conjugate is to count boxes: in the earlier example
$\lambda = (4, 2, 2, 1)$, the first column has 4 boxes, the second has 3,
and so on. To compute the number of boxes in column $j$ we need to know how
many rows have length at least $j$. As a consequence, if $t_c$ is the array
representing the conjugate, the following formula gives the value of the
entries of the conjugate:

$$t_c[j] = |\{\, i \mid 1 \le i \le l(\lambda) \wedge t[i] \ge j \,\}| .$$

Note that $t_c[j] = 0$ if $j > t[1]$, so the previous expression needs to be
computed only from $j = 1$ to $j = t[1]$. This last property will be one of
our predicates used to check the correctness of loop invariants.
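
The counting formula above can be turned directly into a deliberately naive reference implementation. The sketch below is not taken from SCHUR: the name conjugate_naive is hypothetical, and it assumes the 1-based, 0-terminated array layout and the bound MAX = 100 used in this paper.

```c
#include <assert.h>

#define MAX 100

/* Naive reference: tc[j] = number of rows i with t[i] >= j.
   Assumes the paper's layout: index 0 unused, parts in t[1..], 0-terminated. */
void conjugate_naive(const int t[MAX], int tc[MAX])
{
    assert(t[MAX - 1] == 0);          /* terminator must be present */

    int len = 0;                      /* l(lambda): number of non-zero parts */
    while (t[len + 1] != 0)
        len++;

    for (int j = 1; j < MAX; j++) {
        int count = 0;
        for (int i = 1; i <= len; i++)
            if (t[i] >= j)
                count++;
        tc[j] = count;                /* becomes 0 as soon as j > t[1] */
    }
}
```

For t = (3, 2, 1, 1, 1) this fills tc with (5, 2, 1) followed by zeros. Its cost is proportional to MAX times the length of the partition, which is the kind of overhead an optimized implementation avoids.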

3.4 SCHUR Implementation 

Here follows the code of the conjugate function extracted from the SCHUR 
software. We expanded type definitions (C "structs" and "typedefs") from the 
original code just to simplify the work of Frama-C and to make this part of code 
independent from the rest of the SCHUR software (getting rid of global variables 
and so on). 

#define MAX 100

void conjgte (int A[MAX], int B[MAX]) {
  int i, partc = 1, edge = 0;

  while (A[partc] != 0) {
    edge = A[partc];
    do
      partc = partc + 1;
    while (A[partc] == edge);
    for (i = A[partc] + 1; i <= edge; i++)
      B[i] = partc - 1;
  }
}

Note that this implementation is not naive (and not so easy to understand) 
but its time complexity is optimal (linear in the length of the partition). 

The algorithm is based on looking for the set of descents of the partition
(a descent is an index $i$ such that $t[i] < t[i-1]$). The do-while loop
follows a "flat" portion of the partition ($t[i] = t[i-1]$) until a descent
is found. Next, the for-loop assigns the values of the B array according to
the flat portion. The following figure clarifies this: we denote by partc_1
the value of partc at the entrance of the while loop, and by partc_2 the
value of partc after the do-while loop. For clarity's sake we supposed
A[partc_2] + 1 to be different from A[partc_1]. If we count boxes column by
column to construct array B, it is clear that B[i] = partc_2 - 1 for all
A[partc_2] + 1 <= i <= A[partc_1] = edge.



[Figure: Ferrers diagram of A showing the flat portion of rows from partc_1
to partc_1 + n, followed by row partc_2.]
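
The same idea can be written with one index per row, which may be easier to follow than the do-while formulation. This is only a sketch under the same assumptions (1-based, 0-terminated arrays whose entries are non-increasing and smaller than MAX-1, target array zero-initialized by the caller); conjugate_by_descents is a hypothetical name, not SCHUR code.

```c
#include <assert.h>

#define MAX 100

/* Row-indexed rendering of the descent idea (hypothetical, not SCHUR code).
   If row r is strictly longer than row r+1, then columns t[r+1]+1 .. t[r]
   contain exactly r boxes. tc must be zero-initialized by the caller. */
void conjugate_by_descents(const int t[MAX], int tc[MAX])
{
    assert(t[MAX - 1] == 0);          /* 0-terminated input */

    for (int r = 1; t[r] != 0; r++) {
        /* The range is empty inside a flat portion (t[r+1] == t[r]). */
        for (int col = t[r + 1] + 1; col <= t[r]; col++)
            tc[col] = r;
    }
}
```

For (4, 2, 2, 1), row 1 sets columns 3 and 4 to 1, row 2 sets nothing, row 3 sets column 2 to 3, and row 4 sets column 1 to 4, giving (4, 3, 1, 1). The inner loop body runs t[1] times in total, so the work stays linear.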



4 The Formal Proof of the Conjugate Function

4.1 Annotations

In the following paragraphs we present the annotations added to the code.
Note that these are the only additions made to it. First we have to specify
the model of integers we want to deal with:

#pragma JessieIntegerModel(strict)

This means that int types are modeled by integers with appropriate bounds, 
and for each arithmetic operation, it is mandatory to show that no overflow 
occurs. 

Next, we have to express in first-order logic what an integer partition (stored 
in an array) is: 

#define MAX 100

/*@ predicate is_partition{L}(int t[]) =
      (\forall integer i; 1 <= i < MAX ==> 0 <= t[i] < (MAX-1)) &&
      (\forall integer i,j; 1 <= i <= j < MAX ==> t[j] <= t[i]) &&
      t[MAX-1] == 0;
*/

Note that annotations are written inside C comments starting with an @. The
{L} term is the context (pre, post, etc.); we do not detail it here, see
[5,6] for details.

The data structure (array of integers) comes from the way the SCHUR software
represents integer partitions. 0 is used as an end-of-array marker, just
like the terminating null character of C strings. The MAX value comes from
the original source code as well. The first line of the predicate
is_partition expresses that we are able to compute the conjugate (if at
least one element is greater than or equal to MAX-1, the conjugate cannot be
stored in an array of size MAX-1 with the last element fixed to 0). In the
source code this is enforced by a simple external test on t[1], but
expressing it this way simplifies the job of the automatic provers. The
second line of the predicate defines the non-increasing order.
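
For testing outside the prover, the predicate can be mirrored by an ordinary C function. The following is a sketch and not part of the verified code; the names is_partition_rt and require_partition are hypothetical. Checking adjacent entries is enough for the second line of the predicate, because <= is transitive.

```c
#include <assert.h>

#define MAX 100

/* Runtime counterpart of the ACSL predicate is_partition (sketch).
   Returns 1 if t[1..MAX-1] satisfies the three conditions, 0 otherwise. */
int is_partition_rt(const int t[MAX])
{
    for (int i = 1; i < MAX; i++)
        if (t[i] < 0 || t[i] >= MAX - 1)     /* 0 <= t[i] < MAX-1 */
            return 0;
    for (int i = 1; i + 1 < MAX; i++)
        if (t[i + 1] > t[i])                 /* non-increasing order */
            return 0;
    return t[MAX - 1] == 0;                  /* end-of-array marker */
}

/* Guard to place before a call that relies on the precondition. */
void require_partition(const int t[MAX])
{
    assert(is_partition_rt(t));
}
```

Such a check only samples individual inputs; it does not replace the proof, which covers every array satisfying the predicate.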

The following predicate is needed to express how we count boxes to compute
the conjugate. It may be read as: z equals the number of elements of
partition t whose indexes are in {1, ..., j - 1} and whose values are
greater than or equal to k. It is theoretically possible to express it as an
axiomatic theory (a kind of function), but the automatic provers we use make
better use of predicates. Note that we need to make the z = 0 case explicit,
in order to be able to prove the global post-condition is_conjugate(A,B).

/*@ predicate countIfSup{L}(int t[], integer j, integer k, integer z) =
      is_partition{L}(t) &&
      1 <= j <= MAX &&
      1 <= k < MAX &&
      ((1 <= z < j && \forall integer i; 1 <= i <= z ==> t[i] >= k)
       || (z == 0 && \forall integer i; 1 <= i < j ==> t[i] < k));
*/

Here is what we want to obtain at the end of the computation: t2 is the
conjugate of t1 if the following holds:

/*@ predicate is_conjugate{L}(int t1[], int t2[]) =
      \forall integer k; 1 <= k < MAX ==> countIfSup(t1, MAX, k, t2[k]);
*/
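
To experiment with the specification on concrete arrays, the intended meaning can be sketched as executable C. The code below follows the informal reading given above (z is the number of entries among t[1..j-1] that are greater than or equal to k) rather than transcribing the ACSL formula; count_if_sup and is_conjugate_rt are hypothetical names.

```c
#include <assert.h>

#define MAX 100

/* Number of entries among t[1..j-1] that are >= k (the prose reading of
   countIfSup; hypothetical helper, not a transcription of the ACSL formula). */
int count_if_sup(const int t[MAX], int j, int k)
{
    assert(1 <= j && j <= MAX);
    int z = 0;
    for (int i = 1; i < j; i++)
        if (t[i] >= k)
            z++;
    return z;
}

/* t2 is the conjugate of t1 if t2[k] counts the parts of t1 that are >= k. */
int is_conjugate_rt(const int t1[MAX], const int t2[MAX])
{
    for (int k = 1; k < MAX; k++)
        if (t2[k] != count_if_sup(t1, MAX, k))
            return 0;
    return 1;
}
```

With t1 = (3, 2, 1, 1, 1), the array (5, 2, 1) is accepted and (5, 2, 2) is rejected.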

Finally, here is the function. First we have to give precise requirements on
the inputs. For example, \valid(A+(1..(MAX-1))) means that memory has been
allocated so that array indexes from 1 to MAX-1 are allowed. In the original
code, the B array is supposed to be initialized with zeros before calling
the function. This is translated into a requires directive. Next, we specify
which memory elements are modified by the function (assigns). This is used
for safety proofs. In the end, the output is correct if the post-condition
(ensures) is met.

/*@ requires \valid(A+(1..(MAX-1)));
    requires \valid(B+(1..(MAX-1)));
    requires is_partition(A);
    requires \forall integer k; 1 <= k < MAX ==> B[k] == 0;
    assigns B[1..A[1]];
    ensures is_conjugate(A,B);
*/
void conjgte (int A[MAX], int B[MAX])
{
  int i, partc = 1, edge = 0;

Now we have to define the loop variant and invariant for each loop (to prove 
properties). The "loop variant" must decrease, while remaining non negative, to 
be able to prove termination. We also use a "ghost variable" to store the state 
of a variable before any modification. 

  /*@ loop variant MAX-partc;
      loop invariant 1 <= partc < MAX;
      loop assigns B[1..A[1]];
      loop invariant \forall integer k;
        A[partc]+1 <= k <= A[1] ==> countIfSup(A,MAX,k,B[k]);
  */
  while (A[partc] != 0) {
    edge = A[partc];

    /*@ ghost int old_partc = partc; */

    /*@ loop variant MAX-partc;
        loop invariant old_partc <= partc;
        loop invariant \forall integer k;
          old_partc <= k <= partc ==> A[k] == edge;
        loop invariant partc < MAX-1;
    */
    do
      partc = partc + 1;
    while (A[partc] == edge);

We also use the assert directive to have a verification point of a property 
that may help automatic provers for the next properties or global ones. 

    /*@ assert countIfSup(A, partc, edge, partc-1); */

    /*@ loop variant edge-i;
        loop invariant i >= A[partc]+1 && edge+1 >= i;
        loop invariant \forall integer k;
          A[partc]+1 <= k < i ==> countIfSup(A,MAX,k,B[k]);
        loop assigns B[(A[partc]+1)..edge];
    */
    for (i = A[partc]+1; i <= edge; i++)
      B[i] = partc - 1;
  }
}
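
Some of the loop annotations can also be exercised at run time with plain C assertions. The sketch below is an assumption-laden illustration, not the Jessie workflow: conjgte_checked is a hypothetical name, an ordinary local variable stands in for the ghost variable, and only the bounds and variant conditions are checked (the countIfSup invariants are left out).

```c
#include <assert.h>

#define MAX 100

/* conjgte with the bounds and variant annotations turned into runtime
   assertions. A must satisfy is_partition; B must be zero-initialized. */
void conjgte_checked(const int A[MAX], int B[MAX])
{
    int i, partc = 1, edge = 0;

    while (A[partc] != 0) {
        int outer_variant = MAX - partc;
        assert(1 <= partc && partc < MAX);       /* outer loop invariant */

        edge = A[partc];
        int old_partc = partc;                   /* stands in for the ghost variable */

        do {
            int inner_variant = MAX - partc;
            assert(partc < MAX - 1);             /* partc + 1 stays in bounds */
            partc = partc + 1;
            /* variant decreases and stays non-negative */
            assert(0 <= MAX - partc && MAX - partc < inner_variant);
        } while (A[partc] == edge);

        for (i = A[partc] + 1; i <= edge; i++)
            B[i] = partc - 1;

        assert(old_partc < partc);
        assert(0 <= MAX - partc && MAX - partc < outer_variant);
    }
}
```

The assertion partc < MAX - 1 holds on valid input because the body is entered only when A[partc] is non-zero, while is_partition forces A[MAX-1] to be 0.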
Fig. 1. Graphical Interface: default behavior



4.2 Proofs 

Figures 1 to 3 are snapshots of gWhy (the Frama-C graphical interface when
using the Jessie plug-in). We applied this tool to the previous annotated
code.

The Verification Conditions (VC, also called proof obligations) that have to 
be proved one by one (line by line) appear to the left of each of the following 
snapshots. In the upper right part of the window, we can check at a glance what 
hypotheses are known and what is to be proved at the bottom of it (under the 
line). No circularity paradox is possible here, since the proof of a VC can only 
rely on other VC higher in the control-flow graph of the function. 

In the lower right part of the window, the corresponding part of the annota- 
tion is highlighted in the source code with some lines before and after it. 

We will now focus on the VC part, to the left. A (green) dot means that the
property has been proved by the prover of that column. A (blue) rhombus with
a question mark inside (see assertion 13) indicates that the prover was not
able to prove the property. This does not mean that the VC is wrong:
remember that these provers use heuristics. Sometimes you may see scissors,
meaning that the maximum execution time has been reached without proving the
VC. Again, this does not mean that the corresponding VC is wrong. Finally,
at the top of each column, either a (green) check mark or a (red, with a
white cross inside) point is shown. The first one means that all properties
have been proved by that prover. In fig. 1, the (blue) arrow at the top of
the CVC3 column means that it is still computing some VC not shown (numbered
above 16).

The last figure is the final part. The provers have worked on the safety of
the code, that is to say, integer bounds (overflow problems), pointer
dereferencing and termination.

Fig. 2. Graphical Interface continued



As seen in section 4.1, the B array has to be initialized with zeros before
calling the function. This requirement was brought to light by the
annotations and tools: without the line
requires \forall integer k; 1 <= k < MAX ==> B[k] == 0, the postcondition
which states that B is the conjugate of A cannot be proved.
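
The effect of this precondition is easy to observe on a concrete run. In the sketch below, conjgte is the function listed in section 3.4, while conjugate_entry_with_prefill is a hypothetical helper introduced only for this illustration: it fills B with a chosen value before the call and returns one entry afterwards.

```c
#include <assert.h>

#define MAX 100

/* The function from section 3.4 (1-based, 0-terminated arrays). */
void conjgte(int A[MAX], int B[MAX])
{
    int i, partc = 1, edge = 0;

    while (A[partc] != 0) {
        edge = A[partc];
        do
            partc = partc + 1;
        while (A[partc] == edge);
        for (i = A[partc] + 1; i <= edge; i++)
            B[i] = partc - 1;
    }
}

/* Hypothetical demo helper: presets every entry of B to `fill`, runs
   conjgte on the given parts, and returns B[probe]. */
int conjugate_entry_with_prefill(const int parts[], int nparts,
                                 int fill, int probe)
{
    int A[MAX] = {0};
    int B[MAX];

    assert(nparts < MAX - 1);            /* keeps A[MAX-1] == 0 */
    for (int k = 0; k < MAX; k++)
        B[k] = fill;
    for (int k = 0; k < nparts; k++)
        A[k + 1] = parts[k];

    conjgte(A, B);
    return B[probe];
}
```

For (3, 2, 1, 1, 1), only B[1..3] are written. With a zero prefill B[4] is 0, as the conjugate (5, 2, 1) requires, whereas with a prefill of 7 the entry B[4] keeps the stale value 7.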

We have also used the Coq proof assistant. However, since it is not
essential to our present point, we chose to leave aside the details of this
procedure (see section 4.3).



4.3 Problems, Mistakes 

As usual when using formal proof tools, there are several ways to formalize
or to annotate programs. Choices made at this stage are very important for
future proofs. For example, declaring a function as an axiomatic theory or
as a predicate leads to different proofs. A similar remark applies to the
data types used in programs.

For these reasons, it pays to use "good" annotations, that is, annotations
which allow automatic provers to prove verification conditions (VC)
successfully.

When we deal with 40,000 lines of undocumented code, another critical part 
of the work consists in "correctly" isolating the piece of code that we want to 
prove. The code can use global variables, initializations made by other functions, 
or use intricate data-types and so on. 

In the following paragraphs, these problems and associated mistakes are dis- 
cussed. 



Isolating a Part of Program. Generally speaking, the analyzed function must 
be free of external calls. More precisely if a function is called from it, it has to 
be incorporated in the code (like macro expansion) or, at least, independently 
proved. 
[Figure: gWhy window listing the safety proof obligations of function
conjgte (pointer dereferencing, arithmetic overflow checks, variant
decrease), with one column per prover: Alt-Ergo 0.9, Simplify 1.5.4, Z3 2.4
and CVC3.]

Fig. 3. Graphical Interface: Safety



Next, data types must be simplified. Even if Frama-C can cope with simple
structures, it is better to make a first pass over them (removing unions,
expanding typedefs and so on).



How to Make Good Annotations? As previously explained, ACSL is a language
used to annotate C programs. Annotating an existing program consists in
choosing properties (behavior, results, ...) that the user wants to be
"confirmed", such as preconditions, loop invariants, and postconditions. In
our case, for example, one of the most relevant properties we proved is that
the result B is the conjugate of the partition A. This property is stated as
a postcondition.

As usual, there are several ways to formalize annotations. Particularly when 
using external provers, a good method is to know how provers work. Here, we 
have to remember that the automatic provers are SMT solvers (see section 2.2). 

As an example, consider the definition of countIfSup. In a first
formalization we wrote it as an "axiomatization". But due to another problem
that we describe in the next paragraph, we needed to make some proofs in Coq
which used countIfSup. Then, to make it easier for Coq, we decided to try to
define it inductively. Thanks to this other definition, some conditions were
automatically proved by SMT solvers. This example shows how important
formalization choices can be.

In the next paragraph, we will explain and illustrate how Coq allowed us to 
correct some errors in our annotations. 
Why Coq? Once annotations are completed, the method consists in using
automatic provers (using gWhy for example). As previously explained, if all
proof obligations are proved by at least one prover, the work can be
considered finished. But if one or more proof obligations are still
unproved, several approaches are possible. The first one consists in
verifying that the annotations are "sufficient", that is to say that no
precondition or loop invariant is missing. Another approach, when the user
believes the annotations are correct, is to use an external interactive
prover to try to prove the proof obligations that have not been verified
previously.

In our case, we used the interactive theorem prover Coq twice. The first
time was because a postcondition had not been proved by the SMT solvers.
When we began the Coq proof, we discovered that the definition of countIfSup
was incomplete: the second part of the "||" (logical or) was missing.

The second time we used Coq was to prove a loop invariant. Similarly, we
detected another incompleteness in the countIfSup definition (j <= MAX was
needed instead of j < MAX). Proof assistants are well suited to detecting
this kind of problem: when building formal proofs manually, a user can
easily see which hypotheses are necessary.

After having corrected and replaced the "axiomatization" of countlfSup by 
a predicate, all proof obligations have been proved by at least one automatic 
prover. 

Note that the new definition allowed us to remove from the annotations one 
additional lemma which, at first, appeared necessary. 

Other Vicissitudes. Among the main difficulties encountered, we can mention
confidence in the provers we used. In our case, one of the versions of CVC3
was faulty and reported every VC as proved, even false ones. For this reason
we decided to consider a proof obligation proved only when at least two
automatic provers succeed in proving it. This is the case for all our
obligations except one (VC #23 is only proved by Simplify). The proof of
VC #23 is in progress using Coq.

5 Conclusion and Future Work 

We have isolated and formally proved one of the key commands of the SCHUR
software. This work strengthened our belief in formally proving chosen parts
of software of the same kind, composed of 40,000 lines of undocumented code.

Thanks to this approach, we have focused on critical points of the original
code (such as particular initializations of arrays and appropriate bounds)
and, by extension, we have understood how the methodology can be improved.
In particular, it is better to know how SMT automatic provers work in order
to write "good" annotations, so that proof obligations are more easily
proved by them. In the methodology, interactive external provers like Coq
may be used to refine annotations, and to prove obligations when no
automatic prover succeeds.
The conjugate function is a basic building block of combinatorics. This
gives us the prospect of proving other functions. Therefore, as future work,
the second step is to prove algorithms relying on exhaustive enumeration,
such as the computation of Littlewood-Richardson coefficients, Kostka
numbers, Kostka matrices, representation multiplicity in tensor product
decompositions, etc.

The final objective will be to build proved libraries usable by the
scientific community.